I. INTRODUCTION

Particle identification (PID) is important in high energy physics experiments, and it mainly concerns charged
particles. Different techniques [1, 2, 3, 4] are used to study this problem. In this paper, the problem is analysed with
Bayes' theorem of probability theory. It is well known that the best classification methods are based on Bayesian
techniques if all the probability distributions are known [5]. However, a literature survey suggests that how to
use the Bayesian technique in the PID problem has not yet been thoroughly investigated [6, 7, 8].

Different detectors use different variables to do PID, such as the time of flight (TOF) t from a TOF detector, dE/dx
from a wire chamber, the deposited energy E from a shower counter, the Cherenkov radiation emission angle θ from a
RICH counter, the deposited energy W or the number of transition radiation (TR) photon hits N from a TR detector, etc.
For different particles with the same momentum, the random variables (t, dE/dx, E, θ, W, N, etc.)^1 may have
different distributions which can be used for PID; therefore in this paper we call these random variables PID
variables. Sometimes more than one PID variable, each of a different character, can be obtained from one detector.
In a shower counter, for example, both the deposited energy E of a shower and one or two variables which describe
the shape of the shower can be used for electron/hadron separation.

For an unknown charged particle, the momentum is usually known (e.g., given by the drift chamber). Therefore, all
the calculations of probabilities in this paper are made under the condition that the particle's momentum vector is
known. It is indicated with p, θ and φ, which are the magnitude, polar angle and azimuthal angle of the particle's
momentum vector respectively.

The paper is organized as follows. In section two, the PID problem is analysed with the Bayesian technique when
only one PID variable (using TOF t as an example) is obtained for an unknown charged particle. In section three,
a similar analysis is done when two or more PID variables (using TOF t and the deposited energy E in a shower
counter as an example) are available. Section four gives the conclusions.



* Electronic address: tlanding@cugb.edu.cn

^1 In this paper, a random variable and its value are denoted by the same symbol.
II. CASE FOR ONE PID VARIABLE

For a fixed momentum denoted with the parameters p, θ and φ, P(i) (i = 1, 2, 3, 4, 5) are used to represent the
appearing probabilities of the particles e+, μ+, π+, K+, p+ (or e−, μ−, π−, K−, p−)^2 respectively, because five
kinds of particles can have the same parameter values p, θ, φ. Here and below, i and j are used to represent one
particle in e+, μ+, π+, K+, p+ or e−, μ−, π−, K−, p−. And because only five kinds of particle can be the unknown
particle, the appearing probabilities should be normalized to unity for the fixed momentum:

$$
\sum_{i=1}^{5} P(i) = 1 \tag{1}
$$

If some kind of particle does not appear, the corresponding P(i) = 0; and if the number of charged particle kinds
is larger than five (e.g., cosmic rays or particles from nuclear reactions), the sum will have more than five terms.

When there is only one PID variable (e.g., TOF t), what we know is the TOF t of the unknown charged particle
and the conditional probability^3 P(t|i), which is the probability of TOF t given that the unknown charged particle
is i. From the point of view of probability theory, only the probability that the unknown charged particle is i can
be determined. The PID problem can then be written as follows:

Given the momentum of the unknown charged particle and P(t|i), calculate P(i|t),
where P(i|t) is the conditional probability that the unknown charged particle is i, given that the TOF of the unknown
charged particle is t. In the light of the definition of conditional probability and Bayes' theorem, we have

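$$
P(i|t) = \frac{P(t|i)P(i)}{P(t)}
       = \frac{P(t|i)P(i)}{\sum_{j=1}^{5} P(t|j)P(j)}
       = \frac{f_i(t)\,dt \cdot P(i)}{\sum_{j=1}^{5} f_j(t)\,dt \cdot P(j)}
       = \frac{f_i(t)P(i)}{\sum_{j=1}^{5} f_j(t)P(j)} \tag{2}
$$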

where P(j) is the appearing probability of the charged particle j, P(t) is the probability that TOF t occurs, and
f_j(t) is the probability density function (p.d.f.) of variable t for the charged particle j. The denominator in
equation (2) is the normalizing constant, which only ensures that P(i|t) can be interpreted as a probability. The
probability P(i|t) is proportional to f_i(t)P(i), in which f_i(t) is determined by the detector, while P(i) does not
depend on any detector. The p.d.f. for TOF t is usually a Gaussian distribution, i.e.

$$
f_i(t) = \frac{1}{\sqrt{2\pi}\,\sigma_i} \exp\left[-\frac{(t - t_{i0})^2}{2\sigma_i^2}\right] \tag{3}
$$

where σ_i is the TOF resolution for the charged particle i and t_{i0} is the expected value of TOF for the charged
particle i. The general result for the above pattern recognition problem can easily be found in [5].
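The calculation in equations (2) and (3) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the prior values, the common 0.10 ns resolution, the 1 m flight path and the function names are hypothetical and do not come from the paper; the expected TOF t_{i0} is computed from the standard relativistic relation between momentum, mass and velocity.

```python
import math

C_M_PER_NS = 0.299792458  # speed of light in metres per nanosecond

# Particle masses in GeV/c^2 (rounded values).
MASS = {"e": 0.000511, "mu": 0.10566, "pi": 0.13957, "K": 0.49368, "p": 0.93827}


def expected_tof(p, mass, path_length):
    """Expected TOF t_i0 in ns for momentum p (GeV/c) over path_length (m)."""
    beta = p / math.sqrt(p * p + mass * mass)
    return path_length / (beta * C_M_PER_NS)


def gaussian_pdf(t, mean, sigma):
    """Gaussian p.d.f. f_i(t) of equation (3)."""
    return math.exp(-((t - mean) ** 2) / (2.0 * sigma ** 2)) / (
        math.sqrt(2.0 * math.pi) * sigma
    )


def posterior_tof(t, p, prior, sigma, path_length):
    """Equation (2): P(i|t) = f_i(t) P(i) / sum_j f_j(t) P(j)."""
    weighted = {
        name: gaussian_pdf(t, expected_tof(p, MASS[name], path_length), sigma[name])
        * prior[name]
        for name in prior
    }
    norm = sum(weighted.values())
    return {name: w / norm for name, w in weighted.items()}


# Hypothetical inputs, chosen only for illustration.
prior = {"e": 0.05, "mu": 0.05, "pi": 0.60, "K": 0.20, "p": 0.10}  # sums to 1, eq. (1)
sigma = {name: 0.10 for name in MASS}  # assumed TOF resolution in ns
momentum = 0.5       # GeV/c
path_length = 1.0    # m

# Pretend the measured TOF equals the expected kaon TOF.
t_measured = expected_tof(momentum, MASS["K"], path_length)
post = posterior_tof(t_measured, momentum, prior, sigma, path_length)
best = max(post, key=post.get)  # hypothesis with the largest P(i|t)
```

Because the factor dt is common to every term of equation (2), the sketch works directly with the densities f_i(t).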

According to the physical meaning of P(i|t), after the five values P(i|t) have been calculated, the reasonable
hypothesis for the unknown charged particle is the i which makes P(i|t) the largest of the five values. For any
other PID variable X, if all p.d.f.s of TOF t in equation (2) are replaced with the corresponding p.d.f.s of
variable X, equation (2) can be used for PID variable X. But the PID variable X may not have a Gaussian distribution
for every i, as TOF t has in equation (3). For example, the deposited energy E of a fixed momentum electron in an
EM shower counter has a Gaussian distribution, while for π, K the deposited energy usually does not.

^2 The unknown particle's charge is known.

^3 Because TOF t has a continuous distribution, the values of P(t|i) are all infinitesimals.

In equation (2), the p.d.f.s f_i(t) (or P(t|i)) can be obtained from calibration of the detector. Thus the
appearing probability (or prior probability) P(i) is the only unknown quantity. And it is P(i) that makes the PID
problem complicated, because P(i) varies with the studied final states. Here, we give some remarks on P(i).

1. P(i) is the appearing probability of the charged particle i for the studied final state and the momentum vector
(p, θ, φ). This means that different final states have different P(i), while different cuts (e.g., on the number of
charged tracks) in an analysis result in different final states. For example, if all events of J/ψ decay are
considered, we get a set of P(i) for the momentum vector (p, θ, φ); for the same momentum vector (p, θ, φ), if only
the four-charged-track events from J/ψ decay are considered, we will obtain another set of P(i). But why do we need
a second set of P(i)? In fact, the second set of P(i) can be used to enhance the efficiency of PID when we select
events which have only four charged hadron tracks. In the four-charged-track events of J/ψ decay, the appearing
probabilities of leptons (e, μ) are far smaller than those of hadrons (π, K, p). If the second set of P(i) is used
to select events, the influence of leptons will be reduced greatly. Therefore, analyses which use the corresponding
P(i) will have better event selection. After a series of cuts has been used to obtain P(i), the correct use of P(i)
requires that the cuts used in the event selection are not looser than the cuts used in obtaining P(i), because
P(i) cannot be used to select events which do not belong to the corresponding final state. Once P(i) has
been determined, it is not necessary to change it when a new analysis is performed, so long as the conditions
which determine P(i) do not change.

2. P(i) can be obtained from a Monte Carlo (M.C.) process. But a more reliable way of obtaining P(i) is a recurrence
approach on real data. M.C. results or theoretical values (if any) can be used as initial values.

3. If the differences between the P(i) are not large, PID will mainly rely on the inherent PID capability of the
detector, i.e., the p.d.f.s of the PID variable (e.g., f_i(t) in equation (2)). And if the differences between the
σ_i are neglected, we derive the conventional PID method (for a TOF detector), which keeps only the contribution of
the exponential part (the weight of the unknown charged particle to be particle i) in equation (3):

$$
W_i = \exp\left[-\frac{(t - t_{i0})^2}{2\sigma_i^2}\right] \tag{4}
$$

However, the differences between the P(i) cannot be neglected at will. For example, in the final states of J/ψ
decay, the appearing probabilities of π and K differ by a factor that varies with momentum from several to ten [9].
So it is valuable and more accurate to consider the effect of P(i) when large differences between the P(i) exist.
For example, if the weights of an unknown particle to be π and K are equal, i.e., W_3 = W_4, one may have no idea of
what the unknown particle is. But according to equation (2), the probability that the unknown particle is π
is several to ten times larger than the probability that the unknown particle is K. Furthermore, if W_3 < W_4,
the particle will be identified as K, but P(3|t) > P(4|t) may occur because P(3) > P(4); this suggests that
the unknown particle is more likely to be π. Finally, if one does not use P(i), one has in effect set all P(i) to
the same value (equal to 0.2), which is groundless.
4. The PID problem becomes troublesome if P(i) depends on three parameters (p, θ, φ). It is desirable to reduce the
number of parameters. For final states from colliders in which particle and anti-particle beams of equal energy
collide, it is not difficult to see that P(i) is independent of the azimuthal angle φ because of axial symmetry,
and because there are all kinds of channels in one final state (e.g., J/ψ decay or the four-charged-track final
state in J/ψ decay), P(i) may be independent of the polar angle θ. Thus, for the final states from most colliders,
if the cuts used in obtaining P(i) are loose enough, P(i) may depend on only one parameter p, the magnitude of the
momentum vector. In applications, the particle's possible momentum range can be divided into many small
regions (e.g., a width of 50 MeV/c or less per region). For every region, we have five values of P(i). Then, for an
unknown charged particle, the P(i|t) in which we are interested can be calculated.
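The point of remark 3 can be illustrated with a small numerical sketch. It assumes a common resolution σ for all particles, in which case the Gaussian normalization cancels in equation (2) and P(i|t) is proportional to W_i P(i). For brevity only π and K are kept in the sum; the expected TOF values, the resolution and the 10:1 prior ratio are hypothetical numbers chosen for illustration.

```python
import math


def tof_weight(t, t0, sigma):
    """Conventional weight W_i of equation (4)."""
    return math.exp(-((t - t0) ** 2) / (2.0 * sigma ** 2))


def posterior_from_weights(weights, prior):
    """With a common sigma, equation (2) reduces to W_i P(i) / sum_j W_j P(j)."""
    products = {name: weights[name] * prior[name] for name in weights}
    norm = sum(products.values())
    return {name: value / norm for name, value in products.items()}


# Hypothetical expected TOF values (ns) and common resolution (ns).
t0 = {"pi": 3.46, "K": 3.60}
sigma = 0.10

# Hypothetical priors: pions ten times more abundant than kaons.
prior = {"pi": 10.0 / 11.0, "K": 1.0 / 11.0}

# A measured TOF exactly midway gives (numerically) equal weights ...
t = 0.5 * (t0["pi"] + t0["K"])
weights = {name: tof_weight(t, t0[name], sigma) for name in t0}

# ... but the posterior probabilities still differ by the prior ratio.
post = posterior_from_weights(weights, prior)
ratio = post["pi"] / post["K"]
```

Here the conventional weights cannot separate the two hypotheses, while the posterior favours π by the same factor as the priors.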

Obviously the above procedure has no difficulty of correlations between particles mentioned in reference^. 
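Remark 2 above mentions a recurrence approach on real data for obtaining P(i), but the paper does not spell out the procedure. The sketch below shows one possible reading, stated here as an assumption: start from initial values, compute P(i|t) for every track in a momentum region with equation (2), replace P(i) by the average of these posteriors, and repeat. The toy data, the two-species restriction and all names are hypothetical.

```python
import math
import random


def gaussian_pdf(t, mean, sigma):
    """Gaussian p.d.f. f_i(t) of equation (3)."""
    return math.exp(-((t - mean) ** 2) / (2.0 * sigma ** 2)) / (
        math.sqrt(2.0 * math.pi) * sigma
    )


def update_priors(times, means, sigmas, prior):
    """One recurrence step: new P(i) = average of P(i|t) over all tracks."""
    totals = [0.0] * len(prior)
    for t in times:
        weighted = [gaussian_pdf(t, m, s) * p for m, s, p in zip(means, sigmas, prior)]
        norm = sum(weighted)
        for i, w in enumerate(weighted):
            totals[i] += w / norm
    return [total / len(times) for total in totals]


def estimate_priors(times, means, sigmas, initial, n_iter=50):
    """Iterate the update, starting from M.C. or theoretical initial values."""
    prior = list(initial)
    for _ in range(n_iter):
        prior = update_priors(times, means, sigmas, prior)
    return prior


# Toy data set (assumed): 80% of tracks from species 0, 20% from species 1.
rng = random.Random(1)
means = [3.46, 4.69]    # hypothetical expected TOF values in ns
sigmas = [0.10, 0.10]   # hypothetical resolutions in ns
times = [
    rng.gauss(means[0], sigmas[0]) if rng.random() < 0.8 else rng.gauss(means[1], sigmas[1])
    for _ in range(5000)
]

estimate = estimate_priors(times, means, sigmas, initial=[0.5, 0.5])
```

On this toy sample the estimate settles close to the generated fractions; on real data the result would depend on how well the p.d.f.s are calibrated.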



III. CASE FOR TWO AND MORE PID VARIABLES 



When there are two PID variables (e.g., TOF t and the deposited energy E in an EM shower counter) for one unknown
charged particle, the PID problem can be written as follows:

Given the momentum of the unknown charged particle, P(t|i) and P(E|i), calculate P(i|t,E),
where E is the measured value of the deposited energy in the shower counter, P(E|i) is the conditional probability
that the deposited energy is E given that the unknown charged particle is i, and P(i|t,E) is the conditional
probability that the unknown charged particle is i, given that TOF t and the deposited energy E occur
simultaneously. By virtue of the definition of conditional probability, we have again

$$
P(i|t,E) = \frac{P(i,t,E)}{P(t,E)} = \frac{P(t,E|i)P(i)}{P(t,E)} \tag{5}
$$

where P(i,t,E) is the probability that i, t and E occur simultaneously; P(t,E) is the probability that TOF t and the
deposited energy E occur simultaneously; P(t,E|i) is the conditional probability that TOF t and the deposited energy
E occur simultaneously given that the unknown charged particle is i. Because the measurements of TOF t and the
deposited energy E are independent, we have

$$
P(t,E|i) = P(t|i)P(E|i) \tag{6}
$$

Here, it should be noted that the situation of variable E is not the same as that of TOF t: the probability that
E = 0 may not be infinitesimal because of the finite sensitivity of the detector, i.e., the distribution of E is not
a purely continuous distribution but a mixed one:

$$
P(E|i) =
\begin{cases}
P(E=0|i) & \text{if } E = 0; \\
[1 - P(E=0|i)]\, g_i(E)\,dE & \text{if } E > 0
\end{cases} \tag{7}
$$

where g_i(E) is the p.d.f. of variable E for the charged particle i when the deposited energy E > 0. If E = 0 for
the unknown charged particle, then

$$
P(i|t,E=0) = \frac{P(t,E=0|i)P(i)}{P(t,E=0)}
           = \frac{P(t,E=0|i)P(i)}{\sum_{j=1}^{5} P(t,E=0|j)P(j)}
           = \frac{f_i(t)P(E=0|i)P(i)}{\sum_{j=1}^{5} f_j(t)P(E=0|j)P(j)} \tag{8}
$$



If E > 0 for the unknown charged particle, then

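$$
\begin{aligned}
P(i|t,E>0) &= \frac{P(t,E>0|i)P(i)}{P(t,E>0)}
            = \frac{P(t,E>0|i)P(i)}{\sum_{j=1}^{5} P(t,E>0|j)P(j)} \\
           &= \frac{P(t|i)P(E>0|i)P(i)}{\sum_{j=1}^{5} P(t|j)P(E>0|j)P(j)}
            = \frac{f_i(t)\,dt \cdot [1 - P(E=0|i)]\,g_i(E)\,dE \cdot P(i)}
                   {\sum_{j=1}^{5} f_j(t)\,dt \cdot [1 - P(E=0|j)]\,g_j(E)\,dE \cdot P(j)} \\
           &= \frac{f_i(t)\,[1 - P(E=0|i)]\,g_i(E)\,P(i)}
                   {\sum_{j=1}^{5} f_j(t)\,[1 - P(E=0|j)]\,g_j(E)\,P(j)}
\end{aligned} \tag{9}
$$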

Similarly, the reasonable hypothesis for the unknown charged particle is the i which makes P(i|t,E) the largest of
the five values.
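Equations (6) to (9) can be sketched as follows. This is a minimal illustration, not a detector model: only two hypotheses (e and π) are kept, and the expected TOF values, resolutions, zero-energy probabilities P(E=0|i), the Gaussian shape of g_e(E), the exponential shape of g_π(E) and the priors are all assumed numbers. Since every hypothesis is evaluated on the same branch of equation (7), the common factors dt and dE cancel and are omitted.

```python
import math


def gaussian_pdf(x, mean, sigma):
    """Gaussian p.d.f., used for f_i(t) and for the electron g_i(E)."""
    return math.exp(-((x - mean) ** 2) / (2.0 * sigma ** 2)) / (
        math.sqrt(2.0 * math.pi) * sigma
    )


def exponential_pdf(x, scale):
    """Exponential p.d.f., an assumed non-Gaussian shape for the pion g_i(E)."""
    return math.exp(-x / scale) / scale


def energy_term(energy, p_zero, g):
    """Mixed distribution of equation (7), without the common dE factor."""
    if energy == 0.0:
        return p_zero
    return (1.0 - p_zero) * g(energy)


def posterior_tof_energy(t, energy, species, prior):
    """Equations (8) and (9): P(i|t,E) for independent t and E, equation (6)."""
    products = {}
    for name, s in species.items():
        f_t = gaussian_pdf(t, s["t0"], s["sigma_t"])
        products[name] = f_t * energy_term(energy, s["p_zero"], s["g"]) * prior[name]
    norm = sum(products.values())
    return {name: value / norm for name, value in products.items()}


# Hypothetical calibration for a 0.5 GeV/c track (times in ns, energies in GeV).
species = {
    "e": {"t0": 3.34, "sigma_t": 0.10, "p_zero": 0.01,
          "g": lambda energy: gaussian_pdf(energy, 0.5, 0.05)},
    "pi": {"t0": 3.46, "sigma_t": 0.10, "p_zero": 0.30,
           "g": lambda energy: exponential_pdf(energy, 0.2)},
}
prior = {"e": 0.1, "pi": 0.9}

post_no_energy = posterior_tof_energy(3.40, 0.0, species, prior)   # E = 0, eq. (8)
post_full_energy = posterior_tof_energy(3.40, 0.5, species, prior)  # E > 0, eq. (9)
```

With these assumed numbers the TOF alone barely separates the two hypotheses, so the result is driven by the energy term and the priors: a track with E = 0 comes out as π, while a track depositing its full momentum as energy comes out as e.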

Obviously, it is not difficult to generalize the above calculation to the case of many independent PID variables.
Two PID variables from two different detectors are usually independent. Furthermore, the method can be used in the
same way when a PID variable has a discrete distribution (e.g., the hit probability in a μ detector), and its use
is straightforward in this case.

If two PID variables X and Y are correlated, the conditional probability is

$$
P(X,Y|i) = f_i(X,Y)\,dX\,dY \tag{10}
$$

where f_i(X,Y) is the joint p.d.f. of PID variables X and Y for particle i. Similarly, we have

$$
P(i|X,Y) = \frac{f_i(X,Y)P(i)}{\sum_{j=1}^{5} f_j(X,Y)P(j)} \tag{11}
$$

Since the joint p.d.f. f_i(X,Y) is difficult to obtain, the above equation (11) is not very useful.



IV. CONCLUSIONS 



By employing Bayes' theorem of probability theory, we have clarified the usage of all types of PID information.
A corresponding method applicable to the PID problem is also proposed. The method has some attractive properties.
First, the final results (e.g., P(i|t), P(i|t,E)) are probabilities, which have a definite physical meaning. Second,
when a PID variable has a non-Gaussian distribution (e.g., the Landau distribution of dE/dx), this method can be
used as well. Finally, the conventional PID method can be derived from it after some approximation.